242 research outputs found

    GraalWeb ou accéder à une bibliothèque décentralisée de grammaires locales

    International audience

    On the analysis of locative prepositional phrases : the classifier/proper noun pairing

    International audience

    Vers la construction d'une bibliothèque en-ligne de grammaires linguistiques

    National audience. Local grammars are a simple and effective means of identifying and analyzing local syntactic constraints in texts. The explosion in their number and their geographic dispersion lead us to implement a management tool: an online library of local grammars. After describing their formalism, we give a broad survey of the use of local grammars within RELEX, an informal network of European laboratories. We focus mainly on work carried out on French. Finally, we briefly describe our local grammar management system.

    Using subcategorization frames to improve French probabilistic parsing

    Poster. International audience. This article presents results on probabilistic parsing enhanced with a word-clustering approach based on a French syntactic lexicon, the Lefff. We show that by applying this clustering method to the verbs and adjectives of the French Treebank, we obtain accurate performance on French with a parser based on a Probabilistic Context-Free Grammar.

    Les disfluences dans les mots composés

    National audience. Disfluencies, a phenomenon specific to speech, characteristically break the syntactic linearity of an utterance. Compound words tend to form syntactic and semantic units. In this article, we show that uttering such expressions in spoken discourse is less conducive to disfluencies than uttering a free sequence of words. To this end, we developed an automatic procedure for the probabilistic recognition of compound words that includes a prior iterative detection of disfluencies.

    Real-time unsupervised classification of web documents

    International audience. This paper addresses the problem of clustering dynamic collections of web documents. We present an iterative algorithm based on fine-grained keyword extraction (simple words, compound words, and proper nouns). Each new document inserted in the collection is either assigned to an existing class containing documents on the same topic or assigned to a new class. After each step, classes are refined using statistical techniques when necessary. The implementation of this algorithm was successfully integrated into an application used for Information Intelligence.

    A generic tool to generate a lexicon for NLP from Lexicon-Grammar tables

    International audience. Lexicon-grammar tables constitute a large-coverage syntactic lexicon, but they cannot be used directly in Natural Language Processing (NLP) applications because they sometimes rely on implicit information. In this paper, we introduce a generic tool for generating a syntactic lexicon for NLP from the lexicon-grammar tables. It relies on a global table that contains undefined information and on a single extraction script that includes all the operations to be performed for all tables. We also describe an experiment conducted to generate a new lexicon of French verbs and nouns.

    Strategies for Contiguous Multiword Expression Analysis and Dependency Parsing

    International audience. In this paper, we investigate various strategies for performing both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of the French Treebank \cite{abeille:04}, as instantiated in the SPMRL Shared Task \cite{spmrl:st:2013}. Our work focuses on using an alternative representation of syntactically regular MWEs, which captures their internal syntactic structure. We obtain a system whose performance is comparable to that of previous work on this dataset, but which predicts both syntactic dependencies and the internal structure of MWEs. This can be useful for capturing the various degrees of semantic compositionality of MWEs.

    A new semantically annotated corpus with syntactic-semantic and cross-lingual senses

    International audience. In this article, we describe a new sense-tagged corpus for Word Sense Disambiguation. The corpus consists of instances of 20 French polysemous verbs. Each verb instance is annotated with three sense labels: (1) the actual translation of the verb in the English version of this instance in a parallel corpus, (2) an entry for the verb in a computational dictionary of French (the Lexicon-Grammar tables), and (3) a fine-grained sense label resulting from the concatenation of the translation and the Lexicon-Grammar entry.